Luoyu Zhang, MIT ID: 914165258, Email: luoyu@mit.edu¶

Yutong Lin, MIT ID: 922403517, Email: brenda_l@mit.edu¶

In [ ]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings("ignore")

Q1¶

Use of PERMNO vs. TICKERS:¶

PERMNO is a unique identifier assigned by the Center for Research in Security Prices (CRSP) to each security in its database. It provides a consistent way to track a security over time, regardless of changes in ticker symbol, company name, or corporate actions such as mergers and acquisitions. This consistency is important for analysis over long periods, as it allows for the unambiguous identification of securities.

TICKERS are trading symbols assigned to securities listed on public exchanges. They can change over time for various reasons, such as rebranding, mergers, or moving between exchanges, which can create confusion in long-term analysis.
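As a minimal sketch on made-up data (hypothetical permno, ticker symbols, and prices), grouping by ticker fragments a single security's history whenever its symbol changes, while grouping by permno keeps the history intact:

```python
import pandas as pd

# Hypothetical security whose ticker changed mid-sample (e.g. a rebranding);
# all identifiers and prices here are illustrative
toy = pd.DataFrame({
    'permno': [11111, 11111, 11111, 11111],
    'ticker': ['OLD', 'OLD', 'NEW', 'NEW'],
    'date': pd.to_datetime(['2021-10-29', '2021-11-30', '2021-12-31', '2022-01-31']),
    'price': [32.1, 33.0, 31.5, 34.2],
})

# Grouping by ticker splits one security's history into two pieces;
# grouping by permno does not
by_ticker = toy.groupby('ticker').size().to_dict()
by_permno = toy.groupby('permno').size().to_dict()
print(by_ticker)  # {'NEW': 2, 'OLD': 2}
print(by_permno)  # {11111: 4}
```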

Q2¶

Discrepancies between One-Month Price Change and PRC Column:¶

a. This discrepancy is not a data error. The prc column reflects the total return: it includes price appreciation/depreciation plus dividends and other distributions paid to investors, giving a comprehensive view of a stock's performance.

The percent change in price simply reflects the movement in the security's market price. It does not account for dividends or other income from holding the security; it captures only price appreciation/depreciation.

b. For some companies these values may always match, because those companies pay no dividends or distributions and have no other corporate actions affecting total return.
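The distinction can be illustrated with a one-month example on made-up numbers: a $100 stock that ends the month at $103 after paying a $1 dividend.

```python
# Illustrative numbers (hypothetical): a $100 stock ends the month at $103
# after paying a $1 dividend
p0, p1, dividend = 100.0, 103.0, 1.0

price_return = p1 / p0 - 1                # price appreciation only
total_return = (p1 + dividend) / p0 - 1   # includes the distribution, as prc does

print(round(price_return, 4), round(total_return, 4))  # 0.03 0.04
# with dividend = 0 the two measures coincide, which is the situation in part (b)
```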

In [ ]:
# Price_Ret(T1)

file_path = 'sp500raw.xlsx'
sp500_data = pd.read_excel(file_path)
sp500_data.drop(columns=['Unnamed: 0'], inplace=True)
sp500_data['date'] = pd.to_datetime(sp500_data['date'])
sp500_data.sort_values(by=['permno', 'date'], inplace=True)

sp500_data['Price_Ret(T1)'] = sp500_data.groupby('permno')['price'].pct_change()


display(sp500_data.head())
permno date price shrout prc mcap Price_Ret(T1)
48 10104 2011-01-31 32.0300 5052420 0.024920 1.618290e+08 NaN
719 10104 2011-02-28 32.9000 5061000 0.027162 1.665069e+08 0.027162
1100 10104 2011-03-31 33.4325 5060516 0.016185 1.691857e+08 0.016185
1836 10104 2011-04-29 35.9600 5060516 0.077395 1.819762e+08 0.075600
2410 10104 2011-05-31 34.2200 5068000 -0.048387 1.734270e+08 -0.048387

Q3¶

Variability in the Number of Companies:¶

The number of companies in the S&P 500 index does not always equal exactly 500, due to factors such as mergers, acquisitions, bankruptcies, and the addition or removal of companies under the index committee's eligibility criteria. These events cause the number of constituents to fluctuate over time, so this is not a mistake: it reflects the dynamic nature of the stock market and the index's ongoing composition adjustments to maintain its representation of the leading U.S. companies.
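This fluctuation is easy to verify by counting distinct permnos per month; a minimal sketch on a made-up membership panel (one stock drops in February, another is added in March):

```python
import pandas as pd

# Toy membership panel with hypothetical permnos: permno 2 leaves after January,
# permno 4 enters in March
panel = pd.DataFrame({
    'date': pd.to_datetime(['2020-01-31'] * 3 + ['2020-02-29'] * 2 + ['2020-03-31'] * 3),
    'permno': [1, 2, 3, 1, 3, 1, 3, 4],
})

counts = panel.groupby('date')['permno'].nunique()
print(counts.tolist())  # [3, 2, 3]
```

The same `groupby('date')['permno'].nunique()` applied to sp500_data shows the constituent count drifting around 500.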

Q4¶

303 companies are present over the entire sample¶

761 unique companies are in the sample¶

In [ ]:
date_counts = sp500_data.groupby('permno').size()
full_presence_companies = date_counts[date_counts == sp500_data['date'].nunique()]

unique_companies = sp500_data['permno'].nunique()

(full_presence_companies.count(), unique_companies)
Out[ ]:
(303, 761)

Q5¶

First, we plot the time series of market-cap percentiles across all stocks each month

In [ ]:
mcap_plot = sp500_data.groupby('date')['mcap'].describe(percentiles=[0.05,0.25,0.75,0.95]).reset_index()

col_plot = ['min', '5%', '25%', '50%', '75%', '95%','max']
df_list = []
for i in col_plot:
    tmp_df = mcap_plot[['date', i]].copy()  # .copy() avoids SettingWithCopyWarning on the assignments below
    tmp_df['flag'] = i
    tmp_df.rename(columns={i: 'mcap'}, inplace=True)
    df_list.append(tmp_df)
mcap_final = pd.concat(df_list,axis=0)

plt.figure(figsize=(8,3))
plt.title('Different percentile for market cap')
sns.lineplot(data=mcap_final,x = 'date',y = 'mcap',hue='flag')
Out[ ]:
<Axes: title={'center': 'Different percentile for market cap'}, xlabel='date', ylabel='mcap'>
In [ ]:
nrows = 4
ncols = 2
fig, axes = plt.subplots(nrows, ncols, figsize=(12, 8))  

axes_flat = axes.flatten()

for i, col in enumerate(col_plot):
    # Filter the data for the current percentile
    data = mcap_plot[['date', col]].copy()
    data.rename(columns={col: 'mcap'}, inplace=True)
    
    # Plot on the i-th subplot in the flattened axes array
    sns.lineplot(ax=axes_flat[i], data=data, x='date', y='mcap')
    axes_flat[i].set_title(f'Market Cap {col}')
    axes_flat[i].set_ylabel('Market Cap')
    axes_flat[i].set_xlabel('Date')

if len(col_plot) % 2 != 0:
    axes_flat[-1].set_visible(False)

plt.tight_layout()
plt.show()

Q5 a¶

Calculate the market-cap percentile range for the month prior to each stock leaving the sample

In [ ]:
percentiles=[0, 0.05,0.25,0.50, 0.75,0.95, 1]

last_dates = sp500_data.groupby('permno')['date'].max()
one_month_prior = last_dates - pd.DateOffset(months=1)

# Convert 'one_month_prior' to a DataFrame for merging
one_month_prior_df = one_month_prior.reset_index()
one_month_prior_df.columns = ['permno', 'date']

prior_to_leaving = pd.merge(sp500_data, one_month_prior_df, how='inner', on=['permno', 'date'])
prior_to_leaving['date'] = prior_to_leaving['date'].dt.to_period('M')

# Calculate the quantiles of mcap for all companies on each date
prior_percentiles = prior_to_leaving.groupby('date')['mcap'].quantile(np.array(percentiles)).unstack()


prior_percentiles
Out[ ]:
0.00 0.05 0.25 0.50 0.75 0.95 1.00
date
2011-02 2075951.64 3.072294e+06 7.057662e+06 1.203937e+07 1.589526e+07 1.897997e+07 19751150.10
2011-09 1119252.00 1.452419e+06 2.785089e+06 4.450926e+06 6.116763e+06 7.449433e+06 7782600.00
2011-11 1803942.70 1.803943e+06 1.803943e+06 1.803943e+06 1.803943e+06 1.803943e+06 1803942.70
2012-02 26233301.16 2.623330e+07 2.623330e+07 2.623330e+07 2.623330e+07 2.623330e+07 26233301.16
2012-03 1212050.28 2.275960e+06 6.531599e+06 1.185115e+07 1.734458e+07 2.173933e+07 22838013.00
2012-04 3377462.40 3.860524e+06 5.792769e+06 8.208075e+06 1.062338e+07 1.255563e+07 13038687.68
2012-07 5268457.95 5.268458e+06 5.268458e+06 5.268458e+06 5.268458e+06 5.268458e+06 5268457.95
2012-11 2062613.50 2.062614e+06 2.062614e+06 2.062614e+06 2.062614e+06 2.062614e+06 2062613.50
2012-12 1647663.24 1.647663e+06 1.647663e+06 1.647663e+06 1.647663e+06 1.647663e+06 1647663.24
2013-04 2508740.00 3.546321e+06 7.696646e+06 1.288455e+07 1.807246e+07 2.222278e+07 23260362.54
2013-08 24212741.13 2.421274e+07 2.421274e+07 2.421274e+07 2.421274e+07 2.421274e+07 24212741.13
2013-09 10210123.72 1.021012e+07 1.021012e+07 1.021012e+07 1.021012e+07 1.021012e+07 10210123.72
2014-02 3066332.61 3.066333e+06 3.066333e+06 3.066333e+06 3.066333e+06 3.066333e+06 3066332.61
2014-04 3096085.00 3.096085e+06 3.096085e+06 3.096085e+06 3.096085e+06 3.096085e+06 3096085.00
2014-05 3333934.08 4.459060e+06 8.959565e+06 1.458520e+07 2.021083e+07 2.471133e+07 25836459.32
2014-06 3970910.59 3.970911e+06 3.970911e+06 3.970911e+06 3.970911e+06 3.970911e+06 3970910.59
2015-06 13297233.60 1.329723e+07 1.329723e+07 1.329723e+07 1.329723e+07 1.329723e+07 13297233.60
2015-07 15468946.30 1.546895e+07 1.546895e+07 1.546895e+07 1.546895e+07 1.546895e+07 15468946.30
2015-09 2298075.78 2.606799e+06 3.841693e+06 5.385310e+06 1.102737e+07 1.554102e+07 16669427.56
2015-10 15915135.35 1.620842e+07 1.738157e+07 1.884801e+07 2.031444e+07 2.148759e+07 21780876.17
2015-11 1851137.93 3.240462e+06 8.797758e+06 1.574438e+07 2.269100e+07 2.824830e+07 29637619.68
2016-01 1818688.76 1.891327e+06 2.181881e+06 4.534724e+06 8.410461e+06 1.235596e+07 13342339.50
2016-02 2445489.78 2.523534e+06 2.835710e+06 3.225929e+06 7.893580e+06 1.162770e+07 12561230.44
2016-11 7732550.52 8.477358e+06 1.145659e+07 1.518063e+07 1.890467e+07 2.188390e+07 22628707.20
2017-02 3739634.54 3.746957e+06 3.776248e+06 3.812861e+06 3.849475e+06 3.878766e+06 3886088.34
2017-06 4269163.76 4.729676e+06 6.571726e+06 8.874287e+06 1.117685e+07 1.301890e+07 13479411.00
2017-07 6665636.95 1.313331e+07 3.900398e+07 7.134233e+07 7.496822e+07 7.786893e+07 78594106.80
2018-02 6643920.78 6.643921e+06 6.643921e+06 6.643921e+06 6.643921e+06 6.643921e+06 6643920.78
2018-04 3451932.45 3.463228e+06 3.508410e+06 4.910450e+06 5.532378e+07 7.039583e+07 74163841.20
2018-07 14523478.01 1.452348e+07 1.452348e+07 1.452348e+07 1.452348e+07 1.452348e+07 14523478.01
2018-11 6654602.54 6.654603e+06 6.654603e+06 6.654603e+06 6.654603e+06 6.654603e+06 6654602.54
2018-12 2937233.62 3.027302e+06 3.387578e+06 3.837922e+06 4.288266e+06 4.648541e+06 4738610.11
2019-02 4517462.40 4.517462e+06 4.517462e+06 4.517462e+06 4.517462e+06 4.517462e+06 4517462.40
2019-04 4210730.75 4.277526e+06 4.544705e+06 4.878678e+06 5.212652e+06 5.479831e+06 5546625.84
2019-08 3079564.18 3.079564e+06 3.079564e+06 3.079564e+06 3.079564e+06 3.079564e+06 3079564.18
2019-09 70642020.00 7.064202e+07 7.064202e+07 7.064202e+07 7.064202e+07 7.064202e+07 70642020.00
2020-07 2790887.50 2.794907e+06 2.810985e+06 2.831082e+06 2.917197e+06 2.986089e+06 3003312.48
2020-09 7666398.16 7.666398e+06 7.666398e+06 7.666398e+06 7.666398e+06 7.666398e+06 7666398.16
2020-10 4748825.40 4.748825e+06 4.748825e+06 4.748825e+06 4.748825e+06 4.748825e+06 4748825.40
2020-11 11283611.40 1.151760e+07 1.245355e+07 1.362350e+07 1.479344e+07 1.572939e+07 15963381.24
2021-06 28295903.04 2.829590e+07 2.829590e+07 2.829590e+07 2.829590e+07 2.829590e+07 28295903.04
2021-12 6590562.95 1.119795e+07 2.962750e+07 5.266444e+07 5.283919e+07 5.297899e+07 53013945.72
2022-02 4756719.00 5.186794e+06 6.907092e+06 9.057465e+06 9.145936e+06 9.216713e+06 9234407.38
2022-11 4851051.93 4.851052e+06 4.851052e+06 4.851052e+06 4.851052e+06 4.851052e+06 4851051.93
2023-06 4178913.20 4.178913e+06 4.178913e+06 4.178913e+06 4.178913e+06 4.178913e+06 4178913.20
2023-07 4622472.00 4.629070e+06 4.655463e+06 4.688453e+06 4.721444e+06 4.747836e+06 4754434.36
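One caveat on the `pd.DateOffset(months=1)` step above: subtracting a calendar month from an end-of-month trading date generally does not land on the previous month-end, so the exact-date inner merge can silently drop some stocks. Anchoring with `pd.offsets.MonthEnd` avoids this; a small sketch:

```python
import pandas as pd

last_date = pd.Timestamp('2011-04-29')  # April's last trading day in the sample

naive = last_date - pd.DateOffset(months=1)   # calendar-month shift: 2011-03-29
robust = last_date - pd.offsets.MonthEnd(1)   # previous month-end: 2011-03-31

print(naive.date(), robust.date())  # 2011-03-29 2011-03-31
```

The naive date will not match any end-of-month observation in the panel, whereas the `MonthEnd`-anchored date will (up to trading-day vs. calendar-day differences).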

Q5 b¶

Calculate the percentile range of market caps in the month each stock enters the sample

In [ ]:
first_dates = sp500_data.groupby('permno')['date'].min()
first_dates_df = first_dates.reset_index()
first_dates_df.columns = ['permno', 'date']

upon_entering = pd.merge(sp500_data, first_dates_df, how = 'inner', on = ['permno', 'date'])
upon_entering['date'] = upon_entering['date'].dt.to_period('M')

entering_percentiles = upon_entering.groupby('date')['mcap'].quantile(np.array(percentiles)).unstack()

entering_percentiles
Out[ ]:
0.00 0.05 0.25 0.50 0.75 0.95 1.00
date
2011-01 1.546778e+06 3.083643e+06 6.678938e+06 1.147552e+07 2.218710e+07 1.007589e+08 4.068335e+08
2011-02 1.020416e+07 1.020416e+07 1.020416e+07 1.020416e+07 1.020416e+07 1.020416e+07 1.020416e+07
2011-03 2.565047e+07 2.565047e+07 2.565047e+07 2.565047e+07 2.565047e+07 2.565047e+07 2.565047e+07
2011-04 8.302238e+06 8.463407e+06 9.108084e+06 9.913930e+06 1.791928e+07 2.432356e+07 2.592463e+07
2011-06 5.492196e+06 6.031846e+06 8.190446e+06 1.088870e+07 1.358694e+07 1.574554e+07 1.628519e+07
... ... ... ... ... ... ... ...
2023-06 7.814901e+07 7.814901e+07 7.814901e+07 7.814901e+07 7.814901e+07 7.814901e+07 7.814901e+07
2023-08 4.413831e+07 4.413831e+07 4.413831e+07 4.413831e+07 4.413831e+07 4.413831e+07 4.413831e+07
2023-09 5.968635e+07 6.050417e+07 6.377542e+07 6.786448e+07 7.195355e+07 7.522480e+07 7.604262e+07
2023-10 1.448573e+07 1.473669e+07 1.574049e+07 1.699525e+07 3.232717e+07 4.459270e+07 4.765909e+07
2023-12 2.059155e+07 2.589709e+07 4.711924e+07 7.364693e+07 1.001746e+08 1.213968e+08 1.267023e+08

124 rows × 7 columns

Q5 i-vi¶

In [ ]:
df = sp500_data.copy()
df.set_index('date', inplace=True)
df = df.sort_values(['permno', 'date'])

def geometric_return(returns):
    # Compound a sequence of simple returns into one holding-period return
    return np.prod(1 + returns) - 1

# (i)	Trailing twelve month return based on prc (we will refer to this PRC_Ret(T12))
df['PRC_Ret(T12)'] = df.groupby('permno')['prc'].transform(lambda x: x.rolling(12).apply(geometric_return, raw = True))
# (ii)	Trailing twelve month return based on prices (we will refer to this Prices_Ret(T12))
df['Prices_Ret(T12)'] = df.groupby('permno')['Price_Ret(T1)'].transform(lambda x: x.rolling(12).apply(geometric_return, raw = True))

# (iii)	Trailing twelve-month return excluding the most recent month, based on prc.
# In other words, eleven months of returns starting 12 months ago and skipping the most recent month (we will refer to this as PRC_Ret(T12M1))
df['PRC_Ret(T12M1)'] = df.groupby('permno')['prc'].transform(lambda x: x.shift(1).rolling(window=11).apply(geometric_return, raw = True))
# (iv)	Trailing twelve-month return excluding the most recent month, based on prices.
# In other words, eleven months of returns starting 12 months ago and skipping the most recent month (we will refer to this as Prices_Ret(T12M1))
df['Prices_Ret(T12M1)'] = df.groupby('permno')['Price_Ret(T1)'].transform(lambda x: x.shift(1).rolling(window=11).apply(geometric_return, raw = True))

# one-month-return from 12 months ago
# (v)	The trailing one month return from exactly 12 months ago. 
# In other words, if the EOM period is 2019-03 (March 2019), we want the return for the stock for 2018-03 (March 2018). 
# Do this based both on prc column and price based return. We will refer to this as PRC_Ret(T12_1M) and Prices_Ret(T12_1M)
df['PRC_Ret(T12_1M)'] = df.groupby('permno')['prc'].shift(12)
df['Prices_Ret(T12_1M)'] =  df.groupby('permno')['Price_Ret(T1)'].shift(12)

# standard deviation of the monthly price based returns
# (vi)	Calculate the standard deviation of the monthly price_based returns used in calculating Prices_Ret(T12M1); we call this Vol_Prices_Ret(T12M1)
df['Vol_Prices_Ret(T12M1)'] = df.groupby('permno')['Price_Ret(T1)'].transform(lambda x: x.shift(1).rolling(window=11).std())
# Divide Prices_Ret(T12M1) / Vol_Prices_Ret(T12M1). We will call this SR_Prices_Ret(T12M1)
df['SR_Prices_Ret(T12M1)'] = df['Prices_Ret(T12M1)'] / df['Vol_Prices_Ret(T12M1)']

df[df['permno'] == 13688].head(15)
Out[ ]:
permno price shrout prc mcap Price_Ret(T1) PRC_Ret(T12) Prices_Ret(T12) PRC_Ret(T12M1) Prices_Ret(T12M1) PRC_Ret(T12_1M) Prices_Ret(T12_1M) Vol_Prices_Ret(T12M1) SR_Prices_Ret(T12M1)
date
2011-01-31 13688 46.28 392066 -0.032609 18144814.48 NaN NaN NaN NaN NaN NaN NaN NaN NaN
2011-02-28 13688 46.06 396258 -0.004754 18251643.48 -0.004754 NaN NaN NaN NaN NaN NaN NaN NaN
2011-03-31 13688 44.18 396789 -0.030938 17530138.02 -0.040816 NaN NaN NaN NaN NaN NaN NaN NaN
2011-04-29 13688 46.08 396789 0.043006 18284037.12 0.043006 NaN NaN NaN NaN NaN NaN NaN NaN
2011-05-31 13688 43.38 397950 -0.058594 17263071.00 -0.058594 NaN NaN NaN NaN NaN NaN NaN NaN
2011-06-30 13688 42.03 397950 -0.020632 16725838.50 -0.031120 NaN NaN NaN NaN NaN NaN NaN NaN
2011-07-29 13688 41.43 397950 -0.014276 16487068.50 -0.014276 NaN NaN NaN NaN NaN NaN NaN NaN
2011-08-31 13688 42.35 402245 0.022206 17035075.75 0.022206 NaN NaN NaN NaN NaN NaN NaN NaN
2011-09-30 13688 42.30 402245 0.009563 17014963.50 -0.001181 NaN NaN NaN NaN NaN NaN NaN NaN
2011-10-31 13688 42.90 402245 0.014185 17256310.50 0.014184 NaN NaN NaN NaN NaN NaN NaN NaN
2011-11-30 13688 38.84 405883 -0.094639 15764495.72 -0.094639 NaN NaN NaN NaN NaN NaN NaN NaN
2011-12-30 13688 41.22 405883 0.072992 16730497.26 0.061277 -0.100800 NaN -0.161969 NaN NaN NaN NaN NaN
2012-01-31 13688 40.66 405883 -0.013586 16503202.78 -0.013586 -0.083118 -0.121435 -0.070489 -0.109334 -0.032609 NaN 0.045336 -2.411622
2012-02-29 13688 41.68 412900 0.025086 17209672.00 0.025086 -0.055627 -0.095093 -0.078738 -0.117238 -0.004754 -0.004754 0.045322 -2.586795
2012-03-30 13688 43.41 421211 0.052423 18284769.51 0.041507 0.025610 -0.017429 -0.025478 -0.056587 -0.030938 -0.040816 0.045243 -1.250717
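A quick sanity check of the `geometric_return` helper used throughout, on made-up returns: +10% followed by −5% compounds to (1.10)(0.95) − 1 = 4.5%.

```python
import numpy as np

def geometric_return(returns):
    # compound simple returns: prod(1 + r_i) - 1
    return np.prod(1 + returns) - 1

r = np.array([0.10, -0.05])
print(round(geometric_return(r), 6))  # 0.045
```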
In [ ]:
# Calculate 1-month, 3-month, 6-month returns
df['PRC_Ret(F1M)'] = df.groupby('permno')['prc'].shift(-1)
df['PRC_Ret(F3M)'] = df.groupby('permno')['prc'].rolling(3).apply(geometric_return).groupby(level=0).shift(-3).reset_index(level=0, drop=True)
df['PRC_Ret(F6M)'] = df.groupby('permno')['prc'].rolling(6).apply(geometric_return).groupby(level=0).shift(-6).reset_index(level=0, drop=True)

df[df['permno'] == 13688].head(20)
Out[ ]:
permno price shrout prc mcap Price_Ret(T1) PRC_Ret(T12) Prices_Ret(T12) PRC_Ret(T12M1) Prices_Ret(T12M1) PRC_Ret(T12_1M) Prices_Ret(T12_1M) Vol_Prices_Ret(T12M1) SR_Prices_Ret(T12M1) PRC_Ret(F1M) PRC_Ret(F3M) PRC_Ret(F6M)
date
2011-01-31 13688 46.28 392066 -0.032609 18144814.48 NaN NaN NaN NaN NaN NaN NaN NaN NaN -0.004754 0.005932 -0.085788
2011-02-28 13688 46.06 396258 -0.004754 18251643.48 -0.004754 NaN NaN NaN NaN NaN NaN NaN NaN -0.030938 -0.048486 -0.061023
2011-03-31 13688 44.18 396789 -0.030938 17530138.02 -0.040816 NaN NaN NaN NaN NaN NaN NaN NaN 0.043006 -0.038366 -0.021779
2011-04-29 13688 46.08 396789 0.043006 18284037.12 0.043006 NaN NaN NaN NaN NaN NaN NaN NaN -0.058594 -0.091179 -0.048810
2011-05-31 13688 43.38 397950 -0.058594 17263071.00 -0.058594 NaN NaN NaN NaN NaN NaN NaN NaN -0.020632 -0.013176 -0.085230
2011-06-30 13688 42.03 397950 -0.020632 16725838.50 -0.031120 NaN NaN NaN NaN NaN NaN NaN NaN -0.014276 0.017249 0.002219
2011-07-29 13688 41.43 397950 -0.014276 16487068.50 -0.014276 NaN NaN NaN NaN NaN NaN NaN NaN 0.022206 0.046620 0.002921
2011-08-31 13688 42.35 402245 0.022206 17035075.75 0.022206 NaN NaN NaN NaN NaN NaN NaN NaN 0.009563 -0.073016 0.005746
2011-09-30 13688 42.30 402245 0.009563 17014963.50 -0.001181 NaN NaN NaN NaN NaN NaN NaN NaN 0.014185 -0.014775 0.048444
2011-10-31 13688 42.90 402245 0.014185 17256310.50 0.014184 NaN NaN NaN NaN NaN NaN NaN NaN -0.094639 -0.041753 0.052117
2011-11-30 13688 38.84 405883 -0.094639 15764495.72 -0.094639 NaN NaN NaN NaN NaN NaN NaN NaN 0.072992 0.084966 0.149471
2011-12-30 13688 41.22 405883 0.072992 16730497.26 0.061277 -0.100800 NaN -0.161969 NaN NaN NaN NaN NaN -0.013586 0.064167 0.120918
2012-01-31 13688 40.66 405883 -0.013586 16503202.78 -0.013586 -0.083118 -0.121435 -0.070489 -0.109334 -0.032609 NaN 0.045336 -2.411622 0.025086 0.097960 0.158697
2012-02-29 13688 41.68 412900 0.025086 17209672.00 0.025086 -0.055627 -0.095093 -0.078738 -0.117238 -0.004754 -0.004754 0.045322 -2.586795 0.052423 0.059453 0.063001
2012-03-30 13688 43.41 421211 0.052423 18284769.51 0.041507 0.025610 -0.017429 -0.025478 -0.056587 -0.030938 -0.040816 0.045243 -1.250717 0.017738 0.053329 0.003421
2012-04-30 13688 44.18 415354 0.017738 18350339.72 0.017738 0.000763 -0.041233 -0.016679 -0.057943 0.043006 0.043006 0.045089 -1.285087 -0.010865 0.055318 -0.017533
2012-05-31 13688 43.70 422320 -0.010865 18455384.00 -0.010865 0.051502 0.007377 0.063052 0.018442 -0.058594 -0.058594 0.041669 0.442571 0.046339 0.003349 -0.043417
2012-06-29 13688 45.27 422320 0.046339 19118426.40 0.035927 0.123405 0.077088 0.073653 0.039734 -0.020632 -0.031120 0.040467 0.981883 0.019660 -0.047382 -0.092813
2012-07-31 13688 46.16 422320 0.019660 19494291.20 0.019660 0.162081 0.114168 0.139675 0.092686 -0.014276 -0.014276 0.040988 2.261305 -0.059575 -0.069033 -0.055833
2012-08-31 13688 43.41 426463 -0.059575 18512758.83 -0.059575 0.069110 0.025030 0.136837 0.089965 0.022206 0.022206 0.040912 2.198959 -0.006565 -0.046610 0.003979
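The forward-return construction above (a trailing rolling window shifted back by k periods) can be checked on a toy single-security series with made-up returns; in the actual data the shift is applied within each `permno` group, but the alignment logic is the same:

```python
import numpy as np
import pandas as pd

def geometric_return(returns):
    return np.prod(1 + returns) - 1

# Toy monthly returns for one security (illustrative values)
prc = pd.Series([0.01, 0.02, -0.01, 0.03, 0.00],
                index=pd.period_range('2020-01', periods=5, freq='M'))

# Forward 3-month return: trailing 3-month compounded return, shifted back 3 periods
f3m = prc.rolling(3).apply(geometric_return, raw=True).shift(-3)

# At 2020-01 this compounds the returns for 2020-02 through 2020-04
expected = (1.02 * 0.99 * 1.03) - 1
print(round(f3m.iloc[0], 6) == round(expected, 6))  # True
```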
In [ ]:
df[df['permno'] == 13688].tail(20)
Out[ ]:
permno price shrout prc mcap Price_Ret(T1) PRC_Ret(T12) Prices_Ret(T12) PRC_Ret(T12M1) Prices_Ret(T12M1) PRC_Ret(T12_1M) Prices_Ret(T12_1M) Vol_Prices_Ret(T12M1) SR_Prices_Ret(T12M1) PRC_Ret(F1M) PRC_Ret(F3M) PRC_Ret(F6M)
date
2018-08-31 13688 46.18 517151 0.071959 23882033.18 0.071959 -0.338740 -0.343848 -0.383129 -0.387894 0.039740 0.039740 0.073846 -5.252737 -0.003681 -0.428756 -0.354049
2018-09-28 13688 46.01 517151 -0.003681 23794117.51 -0.003681 -0.324276 -0.324277 -0.321780 -0.321780 -0.025007 -0.032538 0.081380 -3.954051 0.017388 -0.483808 -0.328537
2018-10-31 13688 46.81 517151 0.017388 24207838.31 0.017388 -0.189717 -0.189718 -0.203565 -0.203566 -0.151564 -0.151564 0.071131 -2.861863 -0.436445 -0.393997 -0.354625
2018-11-30 13688 26.38 518674 -0.436445 13682620.12 -0.436445 -0.513643 -0.513643 -0.136983 -0.136984 -0.061104 -0.061104 0.070312 -1.948229 -0.099697 0.130781 0.125019
2018-12-31 13688 23.75 518674 -0.099697 12318507.50 -0.099697 -0.470221 -0.470221 -0.411554 -0.411555 -0.173488 -0.173488 0.140641 -2.926289 0.194400 0.300800 0.293600
2022-10-31 13688 14.93 1987700 0.194400 29676361.00 -0.371368 -0.331439 -0.648126 -0.440254 -0.440255 -0.053536 -0.053536 0.141937 -3.101749 0.051574 0.064970 0.146015
2022-11-30 13688 15.70 1987700 0.051574 31206890.00 0.051574 -0.274032 -0.617912 -0.309637 -0.636651 -0.031581 -0.031581 0.173600 -3.667350 0.035669 -0.005095 0.078980
2022-12-30 13688 16.26 1987700 0.035669 32320002.00 0.035669 -0.296745 -0.629866 -0.320965 -0.642613 0.069117 0.069117 0.172270 -3.730258 -0.022140 -0.005535 0.062730
2023-01-31 13688 15.90 1987700 -0.022140 31604430.00 -0.022140 -0.344685 -0.655098 -0.329848 -0.647289 0.049397 0.049397 0.171354 -3.777486 -0.017610 0.076100 0.107546
2023-02-28 13688 15.62 1987700 -0.017610 31047874.00 -0.017610 -0.315070 -0.639511 -0.302792 -0.633049 -0.060087 -0.060087 0.172015 -3.680186 0.035211 0.084506 0.043532
2023-03-31 13688 16.17 1987785 0.035211 32142483.45 0.035211 -0.278124 -0.620066 -0.302678 -0.632989 -0.017771 -0.017771 0.172020 -3.679734 0.058132 0.068645 -0.002475
2023-04-28 13688 17.11 1987785 0.058132 34011001.35 0.058132 -0.245380 -0.602832 -0.286838 -0.624652 0.012218 0.012218 0.173245 -3.605598 -0.009936 0.029222 -0.047342
2023-05-31 13688 16.94 1995778 -0.009936 33808479.32 -0.009936 -0.303031 -0.633175 -0.296037 -0.629493 0.071959 0.071959 0.172181 -3.655991 0.020071 -0.037781 0.013576
2023-06-30 13688 17.28 1995778 0.020071 34487043.84 0.020071 -0.286416 -0.624429 -0.300456 -0.631819 -0.003681 -0.003681 0.171956 -3.674303 0.019097 -0.066552 0.043980
2023-07-31 13688 17.61 2568985 0.019097 45239825.85 0.019097 -0.285217 -0.623798 -0.298612 -0.630848 0.017388 0.017388 0.172093 -3.665742 -0.074390 -0.074391 NaN
2023-08-31 13688 16.30 2091241 -0.074390 34087228.30 -0.074390 0.173994 -0.382108 0.268346 -0.332449 -0.436445 -0.436445 0.122407 -2.715922 -0.010430 0.053373 NaN
2023-09-29 13688 16.13 2091241 -0.010430 33731717.33 -0.010429 0.290398 -0.320842 0.303999 -0.313684 -0.099697 -0.099697 0.121143 -2.589367 0.010539 0.118412 NaN
2023-10-31 13688 16.30 2133508 0.010539 34776180.40 0.010539 0.091760 0.091762 0.080373 0.080375 0.194400 -0.371368 0.038987 2.061590 0.053374 NaN NaN
2023-11-30 13688 17.17 2133508 0.053374 36632332.36 0.053374 0.093628 0.093631 0.038215 0.038217 0.051574 0.051574 0.036241 1.054517 0.050670 NaN NaN
2023-12-29 13688 18.03 2133508 0.050670 38467149.24 0.050087 0.109469 0.108856 0.055963 0.055966 0.035669 0.035669 0.038131 1.467721 NaN NaN NaN
In [ ]:
df.columns
Out[ ]:
Index(['permno', 'price', 'shrout', 'prc', 'mcap', 'Price_Ret(T1)',
       'PRC_Ret(T12)', 'Prices_Ret(T12)', 'PRC_Ret(T12M1)',
       'Prices_Ret(T12M1)', 'PRC_Ret(T12_1M)', 'Prices_Ret(T12_1M)',
       'Vol_Prices_Ret(T12M1)', 'SR_Prices_Ret(T12M1)', 'PRC_Ret(F1M)',
       'PRC_Ret(F3M)', 'PRC_Ret(F6M)'],
      dtype='object')

Q6¶

Plot the time series of percentiles for each variable (unnormalized)

In [ ]:
var_list = ['Price_Ret(T1)',
            'PRC_Ret(T12)', 'Prices_Ret(T12)', 
            'PRC_Ret(T12M1)', 'Prices_Ret(T12M1)', 
            'PRC_Ret(T12_1M)', 'Prices_Ret(T12_1M)',
            'Vol_Prices_Ret(T12M1)', 'SR_Prices_Ret(T12M1)', 
            'PRC_Ret(F1M)', 'PRC_Ret(F3M)', 'PRC_Ret(F6M)']
In [ ]:
medians_dict = {}

for var in var_list:
    # Calculate percentiles
    percentiles_df = df.groupby('date')[var].describe(percentiles=[0.0, 0.05, 0.25, 0.5, 0.75, 0.95, 1.0]).reset_index()
    percentiles_df = percentiles_df[['date', 'min', '5%', '25%', '50%', '75%', '95%', 'max']]
    
    # Melt the dataframe
    melted_df = percentiles_df.melt(id_vars=['date'], var_name='Percentile', value_name=var)

    # Plotting
    plt.figure(figsize=(10, 5))
    sns.lineplot(data=melted_df, x='date', y=var, hue='Percentile')
    plt.title(f'Monthly Percentiles for {var}')
    plt.xlabel('Date')
    plt.ylabel(var)
    plt.legend(title='Percentile',  fontsize="8", loc ="upper left")
    plt.show()
    
    # Calculating median values for each percentile
    medians_dict[var] = {}
    for percentile in ['min', '5%', '25%', '50%', '75%', '95%', 'max']:
        # Filter the dataframe for the current percentile
        filtered_df = melted_df[melted_df['Percentile'] == percentile]
        # Calculate the median of these values
        median_value = filtered_df[var].median()
        medians_dict[var][percentile] = median_value

# Print median values for each percentile of each variable
for var, percentiles in medians_dict.items():
    print(f"\n{var}:")
    for percentile, median_value in percentiles.items():
        print(f"  Median of the {percentile} percentile: {median_value}")
Price_Ret(T1):
  Median of the min percentile: -0.39167808645820357
  Median of the 5% percentile: -0.08980909941590991
  Median of the 25% percentile: -0.026080186843129516
  Median of the 50% percentile: 0.009840098400984099
  Median of the 75% percentile: 0.04500901155123471
  Median of the 95% percentile: 0.10878458921692565
  Median of the max percentile: 0.351092168353756

PRC_Ret(T12):
  Median of the min percentile: -0.6021358561790884
  Median of the 5% percentile: -0.2601085754336827
  Median of the 25% percentile: -0.029966241406540944
  Median of the 50% percentile: 0.11711134844528892
  Median of the 75% percentile: 0.2663292586916528
  Median of the 95% percentile: 0.5185331023066828
  Median of the max percentile: 1.4529489569958174

Prices_Ret(T12):
  Median of the min percentile: -0.7557145651081932
  Median of the 5% percentile: -0.30622521054976753
  Median of the 25% percentile: -0.05933295773842828
  Median of the 50% percentile: 0.08761902974020108
  Median of the 75% percentile: 0.23487339000570143
  Median of the 95% percentile: 0.4931673249181745
  Median of the max percentile: 3.0724807729108154

PRC_Ret(T12M1):
  Median of the min percentile: -0.5935068160952284
  Median of the 5% percentile: -0.25537950746545657
  Median of the 25% percentile: -0.027062382765922988
  Median of the 50% percentile: 0.10222840543367773
  Median of the 75% percentile: 0.23730038875006415
  Median of the 95% percentile: 0.48209692052144026
  Median of the max percentile: 1.34414294810801

Prices_Ret(T12M1):
  Median of the min percentile: -0.7517020803351337
  Median of the 5% percentile: -0.28932406616548423
  Median of the 25% percentile: -0.05661931829933811
  Median of the 50% percentile: 0.07559991404392996
  Median of the 75% percentile: 0.2093162870674058
  Median of the 95% percentile: 0.4624807893640177
  Median of the max percentile: 2.773236232603188

PRC_Ret(T12_1M):
  Median of the min percentile: -0.235482
  Median of the 5% percentile: -0.08451669999999999
  Median of the 25% percentile: -0.02132025
  Median of the 50% percentile: 0.01242625
  Median of the 75% percentile: 0.04806275
  Median of the 95% percentile: 0.1090796
  Median of the max percentile: 0.3072145

Prices_Ret(T12_1M):
  Median of the min percentile: -0.4043392504930967
  Median of the 5% percentile: -0.0869153216034124
  Median of the 25% percentile: -0.022640108870468856
  Median of the 50% percentile: 0.010797638145016442
  Median of the 75% percentile: 0.04478668054527535
  Median of the 95% percentile: 0.1072377184028884
  Median of the max percentile: 0.33552521058820717

Vol_Prices_Ret(T12M1):
  Median of the min percentile: 0.023235423243071512
  Median of the 5% percentile: 0.036986464081294496
  Median of the 25% percentile: 0.052832811519830866
  Median of the 50% percentile: 0.06965348806941386
  Median of the 75% percentile: 0.0883775945724741
  Median of the 95% percentile: 0.1380265945113937
  Median of the max percentile: 0.8108228070518557

SR_Prices_Ret(T12M1):
  Median of the min percentile: -6.536362197934961
  Median of the 5% percentile: -3.1516797428505585
  Median of the 25% percentile: -0.783304308343572
  Median of the 50% percentile: 1.2254821693188644
  Median of the 75% percentile: 3.6666254972570447
  Median of the 95% percentile: 7.626656448053435
  Median of the max percentile: 16.716384046487384

PRC_Ret(F1M):
  Median of the min percentile: -0.250578
  Median of the 5% percentile: -0.0877414
  Median of the 25% percentile: -0.021781000000000002
  Median of the 50% percentile: 0.011755
  Median of the 75% percentile: 0.0472615
  Median of the 95% percentile: 0.10952209999999996
  Median of the max percentile: 0.335631

PRC_Ret(F3M):
  Median of the min percentile: -0.3746256641295582
  Median of the 5% percentile: -0.14092031136842478
  Median of the 25% percentile: -0.030480130694339747
  Median of the 50% percentile: 0.03312106180321894
  Median of the 75% percentile: 0.10005245330751539
  Median of the 95% percentile: 0.20641178984396635
  Median of the max percentile: 0.5561335475798863

PRC_Ret(F6M):
  Median of the min percentile: -0.5019132744404435
  Median of the 5% percentile: -0.19659327999721904
  Median of the 25% percentile: -0.038681625917077356
  Median of the 50% percentile: 0.05850310626427857
  Median of the 75% percentile: 0.1572815415464443
  Median of the 95% percentile: 0.31806958610497854
  Median of the max percentile: 0.7977277070728639

Next, we apply cross-sectional normalization within each date

In [ ]:
norm_df = df.copy()
In [ ]:
def normalization(x):
    return (x - x.mean()) / x.std()

# Apply the cross-sectional normalization 
for var in var_list:
    norm_df[var] = norm_df.groupby(norm_df.index)[var].transform(normalization)
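A quick check that the cross-sectional normalization does what it should: within each date, the transformed values have mean 0 and (sample) standard deviation 1. A sketch on made-up data:

```python
import pandas as pd

def normalization(x):
    return (x - x.mean()) / x.std()

# Hypothetical two-date cross-section
toy = pd.DataFrame({
    'date': ['2020-01'] * 3 + ['2020-02'] * 3,
    'x': [1.0, 2.0, 6.0, 10.0, 20.0, 60.0],
})
toy['x_norm'] = toy.groupby('date')['x'].transform(normalization)

# Each date's normalized values: mean ~ 0, std ~ 1
check = toy.groupby('date')['x_norm'].agg(['mean', 'std'])
print(check.round(6))
```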
In [ ]:
norm_df.describe()
Out[ ]:
permno price shrout prc mcap Price_Ret(T1) PRC_Ret(T12) Prices_Ret(T12) PRC_Ret(T12M1) Prices_Ret(T12M1) PRC_Ret(T12_1M) Prices_Ret(T12_1M) Vol_Prices_Ret(T12M1) SR_Prices_Ret(T12M1) PRC_Ret(F1M) PRC_Ret(F3M) PRC_Ret(F6M)
count 78481.000000 78481.000000 7.848100e+04 78440.000000 7.848100e+04 7.772000e+04 7.024300e+04 6.955400e+04 7.024300e+04 6.955400e+04 6.951900e+04 6.883100e+04 6.955400e+04 69554.000000 7.772000e+04 7.620700e+04 7.396800e+04
mean 55562.798716 111.790334 6.116220e+05 0.010668 4.677985e+07 -4.022630e-18 2.427719e-18 -3.269024e-18 -2.427719e-18 -3.269024e-18 -2.044168e-18 -3.096901e-18 -2.451768e-17 0.000000 -2.742702e-18 6.340219e-18 -3.458190e-18
std 29005.295471 206.740954 1.116013e+06 0.085216 1.115109e+08 9.990088e-01 9.989744e-01 9.989715e-01 9.989744e-01 9.989715e-01 9.989710e-01 9.989679e-01 9.989715e-01 0.998971 9.990088e-01 9.990022e-01 9.989923e-01
min 10104.000000 2.050000 3.179000e+03 -0.886269 6.203644e+05 -1.233978e+01 -4.415152e+00 -4.617517e+00 -4.248586e+00 -4.924631e+00 -6.740546e+00 -1.267789e+01 -2.096707e+00 -3.859849 -1.063666e+01 -7.420586e+00 -5.197883e+00
25% 24010.000000 39.885000 1.591360e+05 -0.035815 1.084019e+07 -4.943571e-01 -6.118526e-01 -4.984432e-01 -6.093929e-01 -4.978291e-01 -5.807216e-01 -4.985315e-01 -4.732970e-01 -0.692103 -5.712042e-01 -5.989960e-01 -6.117191e-01
50% 60943.000000 67.800000 3.004150e+05 0.011156 1.966517e+07 -3.180546e-03 -5.660440e-02 -5.032311e-02 -5.377272e-02 -4.794963e-02 -1.939954e-03 -8.548682e-04 -1.752337e-01 -0.115771 -2.809995e-03 -1.338180e-02 -2.993152e-02
75% 83143.000000 117.410000 5.844920e+05 0.055954 4.184465e+07 4.960473e-01 5.392211e-01 4.226046e-01 5.435153e-01 4.281541e-01 5.707045e-01 5.043007e-01 2.319764e-01 0.577867 5.594344e-01 5.790098e-01 5.667353e-01
max 93436.000000 7000.450200 2.920640e+07 2.135168 3.071345e+09 2.202466e+01 1.301485e+01 2.006791e+01 1.317903e+01 2.049279e+01 1.332261e+01 2.167890e+01 2.109337e+01 7.696053 1.321799e+01 1.186595e+01 1.171013e+01
In [ ]:
norm_df[norm_df['permno'] == 13688].tail(20)
Out[ ]:
permno price shrout prc mcap Price_Ret(T1) PRC_Ret(T12) Prices_Ret(T12) PRC_Ret(T12M1) Prices_Ret(T12M1) PRC_Ret(T12_1M) Prices_Ret(T12_1M) Vol_Prices_Ret(T12M1) SR_Prices_Ret(T12M1) PRC_Ret(F1M) PRC_Ret(F3M) PRC_Ret(F6M)
date
2018-08-31 13688 46.18 517151 0.071959 23882033.18 0.838340 -1.940569 -1.809446 -2.278698 -2.125198 0.781315 0.816316 0.316145 -2.047335 -0.097830 -3.221805 -2.328881
2018-09-28 13688 46.01 517151 -0.003681 23794117.51 -0.062808 -1.844704 -1.703063 -1.923185 -1.781724 -0.860955 -0.857825 0.644607 -1.594592 1.044488 -2.700666 -1.979743
2018-10-31 13688 46.81 517151 0.017388 24207838.31 1.058523 -1.017202 -0.908586 -1.336724 -1.218756 -2.502201 -2.240800 0.336968 -1.218128 -5.887187 -3.946722 -2.881442
2018-11-30 13688 26.38 518674 -0.436445 13682620.12 -5.567856 -2.476929 -2.326814 -0.712910 -0.602887 -1.499018 -1.463835 0.087739 -0.605428 0.009959 1.073157 0.763731
2018-12-31 13688 23.75 518674 -0.099697 12318507.50 0.049782 -1.837462 -1.718570 -2.006677 -1.869544 -3.411106 -3.314734 2.592583 -0.997873 1.252255 1.237598 0.562316
2022-10-31 13688 14.93 1987700 0.194400 29676361.00 -4.578703 -0.845862 -1.792281 -1.141477 -0.999701 -1.411569 -1.202048 1.340508 -0.450277 -0.214798 -0.200492 0.448685
2022-11-30 13688 15.70 1987700 0.051574 31206890.00 -0.136373 -0.900287 -1.878631 -0.819356 -1.775541 -0.090407 -0.035507 1.984494 -0.992079 1.501962 0.075840 0.687993
2022-12-30 13688 16.26 1987700 0.035669 32320002.00 0.522297 -0.690041 -1.722367 -0.923627 -1.906484 0.085009 0.109053 1.722606 -1.101851 -1.055048 -0.272488 -0.101246
2023-01-31 13688 15.90 1987700 -0.022140 31604430.00 -1.038114 -1.521615 -2.383467 -1.096665 -2.103911 0.960206 0.971058 1.225713 -1.190937 0.253201 0.851233 0.364914
2023-02-28 13688 15.62 1987700 -0.017610 31047874.00 0.285922 -1.359490 -2.292594 -1.472089 -2.370850 -0.602936 -0.570525 1.152623 -1.704291 0.411378 0.832687 -0.003445
2023-03-31 13688 16.17 1987785 0.035211 32142483.45 0.434698 -1.058395 -1.934625 -1.315498 -2.325002 -0.573437 -0.533598 1.170932 -1.439759 0.793134 0.211025 0.027878
2023-04-28 13688 17.11 1987785 0.058132 34011001.35 0.807505 -1.128450 -2.092471 -1.353444 -2.076127 0.947464 0.964085 1.199831 -1.502387 0.322579 -0.325374 0.006359
2023-05-31 13688 16.94 1995778 -0.009936 33808479.32 0.344636 -1.146777 -1.776124 -1.386558 -2.058255 0.715837 0.745695 1.233189 -1.520544 -0.770847 -0.984749 -0.375167
2023-06-30 13688 17.28 1995778 0.020071 34487043.84 -0.743447 -1.435482 -2.077505 -1.427737 -2.057421 1.153905 0.987908 1.376828 -1.584903 -0.240620 -0.176069 -0.157439
2023-07-31 13688 17.61 2568985 0.019097 45239825.85 -0.215000 -1.462780 -2.149704 -1.406606 -2.165069 -0.878636 -0.627945 1.454083 -1.583424 -0.634540 0.320555 NaN
2023-08-31 13688 16.30 2091241 -0.074390 34087228.30 -0.526028 0.278687 -1.268839 0.523082 -1.209264 -6.014655 -5.086610 0.523787 -1.483973 0.784113 0.532947 NaN
2023-09-29 13688 16.13 2091241 -0.010430 33731717.33 0.819754 0.467702 -1.311946 0.283370 -1.337195 -0.118207 -0.081338 0.601206 -1.456317 0.686640 -0.012088 NaN
2023-10-31 13688 16.30 2133508 0.010539 34776180.40 0.701242 0.288897 0.312691 0.095761 0.136870 0.983379 -4.650653 -0.888084 0.585127 -0.491400 NaN NaN
2023-11-30 13688 17.17 2133508 0.053374 36632332.36 -0.455822 0.234010 0.265630 0.412329 0.411065 -0.260586 -0.169683 -0.882816 0.734855 -0.264917 NaN NaN
2023-12-29 13688 18.03 2133508 0.050670 38467149.24 -0.118962 -0.175711 -0.092587 -0.122094 -0.032014 1.521040 0.512341 -1.364996 0.202221 NaN NaN NaN

Plot the time series of the different variables (normalized)

In [ ]:
medians_dict = {}

for var in var_list:
    # Calculate percentiles
    percentiles_df = norm_df.groupby('date')[var].describe(percentiles=[0.05, 0.25, 0.5, 0.75, 0.95]).reset_index()  # describe() always includes min and max
    percentiles_df = percentiles_df[['date', 'min', '5%', '25%', '50%', '75%', '95%', 'max']]
    
    # Melt the dataframe
    melted_df = percentiles_df.melt(id_vars=['date'], var_name='Percentile', value_name=var)

    # Plotting
    plt.figure(figsize=(10, 5))
    sns.lineplot(data=melted_df, x='date', y=var, hue='Percentile')
    plt.title(f'Monthly Cross-Sectional Percentiles of {var}')
    plt.xlabel('Date')
    plt.ylabel(var)
    plt.legend(title='Percentile',  fontsize="8", loc ="upper left")
    plt.show()
    
    # Calculating median values for each percentile
    medians_dict[var] = {}
    for percentile in ['min', '5%', '25%', '50%', '75%', '95%', 'max']:
        # Filter the dataframe for the current percentile
        filtered_df = melted_df[melted_df['Percentile'] == percentile]
        # Calculate the median of these values
        median_value = filtered_df[var].median()
        medians_dict[var][percentile] = median_value

# Print median values for each percentile of each variable
for var, percentiles in medians_dict.items():
    print(f"\n{var}:")
    for percentile, median_value in percentiles.items():
        print(f"  Median of the {percentile} percentile: {median_value}")
(Figures: percentile time-series plots, one per variable in var_list, each showing the min/5%/25%/50%/75%/95%/max cross-sectional percentiles over time.)
Price_Ret(T1):
  Median of the min percentile: -5.198133709689026
  Median of the 5% percentile: -1.4437030258095545
  Median of the 25% percentile: -0.524688685724891
  Median of the 50% percentile: -0.0014935953394055615
  Median of the 75% percentile: 0.5342270934485629
  Median of the 95% percentile: 1.447557540184519
  Median of the max percentile: 4.838470475531766

PRC_Ret(T12):
  Median of the min percentile: -2.8122137017625346
  Median of the 5% percentile: -1.467993990581814
  Median of the 25% percentile: -0.6091559104894488
  Median of the 50% percentile: -0.05685047870553174
  Median of the 75% percentile: 0.5425013903778502
  Median of the 95% percentile: 1.6005720228106035
  Median of the max percentile: 5.149057318241457

Prices_Ret(T12):
  Median of the min percentile: -2.623484856196533
  Median of the 5% percentile: -1.315648860201541
  Median of the 25% percentile: -0.5263023796097035
  Median of the 50% percentile: -0.052164780835206134
  Median of the 75% percentile: 0.44391778732736
  Median of the 95% percentile: 1.3880584390999817
  Median of the max percentile: 9.737366764428469

PRC_Ret(T12M1):
  Median of the min percentile: -2.8301301754108787
  Median of the 5% percentile: -1.4926444633821028
  Median of the 25% percentile: -0.6098870236847719
  Median of the 50% percentile: -0.05389505337245977
  Median of the 75% percentile: 0.5514059392472089
  Median of the 95% percentile: 1.6040604838220167
  Median of the max percentile: 5.155394880887796

Prices_Ret(T12M1):
  Median of the min percentile: -2.693534184992621
  Median of the 5% percentile: -1.3341952696892703
  Median of the 25% percentile: -0.5193179568822941
  Median of the 50% percentile: -0.050156090692311334
  Median of the 75% percentile: 0.46044961079220365
  Median of the 95% percentile: 1.4188482241519078
  Median of the max percentile: 9.158296155924498

PRC_Ret(T12_1M):
  Median of the min percentile: -3.9413278721984
  Median of the 5% percentile: -1.55079159719361
  Median of the 25% percentile: -0.586017149784201
  Median of the 50% percentile: 0.0007278370506201418
  Median of the 75% percentile: 0.5698792839467659
  Median of the 95% percentile: 1.5437454768847636
  Median of the max percentile: 4.512428595710889

Prices_Ret(T12_1M):
  Median of the min percentile: -5.368377826968604
  Median of the 5% percentile: -1.462847497269528
  Median of the 25% percentile: -0.5299846673532286
  Median of the 50% percentile: 0.0008829260397438713
  Median of the 75% percentile: 0.5374256285206424
  Median of the 95% percentile: 1.4617555628557535
  Median of the max percentile: 4.592530937531173

Vol_Prices_Ret(T12M1):
  Median of the min percentile: -1.1862347612492783
  Median of the 5% percentile: -0.8603998246451678
  Median of the 25% percentile: -0.5352320465946132
  Median of the 50% percentile: -0.19496642714764345
  Median of the 75% percentile: 0.24956778785218497
  Median of the 95% percentile: 1.4122652895112808
  Median of the max percentile: 12.788111557892844

SR_Prices_Ret(T12M1):
  Median of the min percentile: -2.425651831754161
  Median of the 5% percentile: -1.4103678600567084
  Median of the 25% percentile: -0.6900796465285024
  Median of the 50% percentile: -0.11123105018101062
  Median of the 75% percentile: 0.5744054197609173
  Median of the 95% percentile: 1.758574158946772
  Median of the max percentile: 4.344820072932867

PRC_Ret(F1M):
  Median of the min percentile: -4.021855099907964
  Median of the 5% percentile: -1.5413294821610142
  Median of the 25% percentile: -0.5694740287258351
  Median of the 50% percentile: -0.003045486995341003
  Median of the 75% percentile: 0.5572185492712635
  Median of the 95% percentile: 1.519735239806513
  Median of the max percentile: 4.788424663800764

PRC_Ret(F3M):
  Median of the min percentile: -3.4982550225861266
  Median of the 5% percentile: -1.5560498359342942
  Median of the 25% percentile: -0.6043054757486624
  Median of the 50% percentile: -0.01170151028380133
  Median of the 75% percentile: 0.5813402540711379
  Median of the 95% percentile: 1.5671776796146417
  Median of the max percentile: 4.430211130750474

PRC_Ret(F6M):
  Median of the min percentile: -3.1576765948687218
  Median of the 5% percentile: -1.525464800769815
  Median of the 25% percentile: -0.6098176122325272
  Median of the 50% percentile: -0.028150177943277784
  Median of the 75% percentile: 0.5693101818271424
  Median of the 95% percentile: 1.5930240459374843
  Median of the max percentile: 4.524673937869889

Q7 Fama-MacBeth Cross-sectional Regression¶

a)¶

In [ ]:
# Period: October 31st 2019 to November 30th 2019
# dependent var: PRC_Ret(F1M)
# independent variables: PRC_Ret(T12M1)

import statsmodels.api as sm

date = norm_df.index[norm_df.index <= '2019-11-30'].max()
print(date)
nov_2019_df = norm_df.loc[date].copy()
nov_2019_df.dropna(inplace=True)

X = sm.add_constant(nov_2019_df['PRC_Ret(T12M1)'])
Y = nov_2019_df['PRC_Ret(F1M)']

print(nov_2019_df.shape)

model = sm.OLS(Y, X).fit()
print(model.summary())
2019-11-29 00:00:00
(472, 17)
                            OLS Regression Results                            
==============================================================================
Dep. Variable:           PRC_Ret(F1M)   R-squared:                       0.035
Model:                            OLS   Adj. R-squared:                  0.033
Method:                 Least Squares   F-statistic:                     17.26
Date:                Tue, 09 Apr 2024   Prob (F-statistic):           3.87e-05
Time:                        09:03:36   Log-Likelihood:                -651.82
No. Observations:                 472   AIC:                             1308.
Df Residuals:                     470   BIC:                             1316.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==================================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
const             -0.0167      0.044     -0.376      0.707      -0.104       0.071
PRC_Ret(T12M1)    -0.1907      0.046     -4.155      0.000      -0.281      -0.101
==============================================================================
Omnibus:                       69.236   Durbin-Watson:                   1.967
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              179.860
Skew:                           0.729   Prob(JB):                     8.79e-40
Kurtosis:                       5.649   Cond. No.                         1.05
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
  • Economic Interpretation: the negative coefficient of -0.1907 on PRC_Ret(T12M1) suggests a reversal effect: stocks with higher returns over the past year (excluding the most recent month) tend to earn lower returns in the following month. Stocks that ran up over that window may be overbought and thus subject to short-term price corrections.
  • Statistical Interpretation:
    • The coefficient for PRC_Ret(T12M1) is -0.1907, with a standard error of 0.046. The negative coefficient suggests that there is an inverse relationship between the past returns (excluding the most recent month) and the forward one month returns.
    • The t-statistic for PRC_Ret(T12M1) is -4.155 and the p-value is 0.000, indicating that the relationship is statistically significant at conventional levels (p < 0.05).
    • The R-squared of the model is 0.035, which means that approximately 3.5% of the variability in forward one month returns can be explained by the past eleven months of returns. This suggests that while the model has found a significant relationship, it explains a relatively small portion of the variance in future returns.

b) Cross-sectional regression¶

In [ ]:
all_df = norm_df.copy()
all_df.dropna(inplace=True)

all_df_bfnorm = df.copy()
all_df_bfnorm.dropna(inplace=True)

dates = all_df.index.unique()
coefficients = []
p_values = []


for period in dates:
    period_data = all_df.loc[period].copy()
    
    X = sm.add_constant(period_data['PRC_Ret(T12M1)'])
    Y = period_data['PRC_Ret(F1M)']
        
    model = sm.OLS(Y, X).fit()
    coefficients.append(model.params['PRC_Ret(T12M1)'])
    p_values.append(model.pvalues['PRC_Ret(T12M1)'])

results_df = pd.DataFrame({'Date': dates, 'Coefficient': coefficients, 'P-Value': p_values})

plt.figure(figsize=(8, 3))
plt.plot(results_df['Date'], results_df['Coefficient'], label='Coefficient')
plt.axhline(0, color='grey', lw=0.5) 
plt.title('Time Series of PRC_Ret(T12M1) Coefficients')
plt.xlabel('Date')
plt.ylabel('Coefficient')

significant_periods = results_df[results_df['P-Value'] < 0.05]
plt.scatter(significant_periods['Date'], significant_periods['Coefficient'], color='red', label='Statistically Significant (p < 0.05)')

plt.legend(fontsize="8", loc ="upper left")
plt.show()
(Figure: time series of PRC_Ret(T12M1) coefficients, with statistically significant months marked in red.)
In [ ]:
print(results_df[-50:-30])
          Date  Coefficient       P-Value
87  2019-05-31    -0.223891  1.114739e-06
88  2019-06-28    -0.009623  8.357974e-01
89  2019-07-31     0.458781  3.191481e-26
90  2019-08-30    -0.426440  3.013890e-22
91  2019-09-30    -0.156068  7.642035e-04
92  2019-10-31    -0.250612  3.123300e-08
93  2019-11-29    -0.190704  3.870269e-05
94  2019-12-31     0.261179  1.214144e-08
95  2020-01-31     0.152727  1.047135e-03
96  2020-02-28     0.320003  8.366097e-13
97  2020-03-31    -0.319613  3.585846e-12
98  2020-04-30     0.190599  2.808616e-05
99  2020-05-29    -0.185043  6.144411e-05
100 2020-06-30     0.274666  1.962104e-09
101 2020-07-31    -0.092968  4.729591e-02
102 2020-08-31     0.079700  8.480634e-02
103 2020-09-30    -0.145121  1.437561e-03
104 2020-10-30    -0.505004  7.784050e-32
105 2020-11-30    -0.213576  4.612879e-06
106 2020-12-31    -0.056299  2.237001e-01
  • Result:

    • The time series plot of coefficients over the entire sample period reveals fluctuations in the predictive power of PRC_Ret(T12M1) over time.
    • There are periods where the coefficient is significantly positive, indicating periods where momentum strategies (buying past winners and selling past losers) would have performed well. Conversely, there are also periods with significantly negative coefficients, indicating short-term reversals where momentum strategies would underperform.
    • The presence of both significantly positive and negative coefficients over the sample indicates that the effectiveness of a momentum strategy can vary greatly over time. This variation could be influenced by market conditions, investor sentiment, economic factors, or other variables not captured by the model.
  • Economic Interpretation:

    • A positive coefficient implies that past positive returns are associated with higher future returns, which supports the momentum strategy. Conversely, a negative coefficient suggests that higher past returns are associated with lower future returns, which would contradict the momentum strategy. On specific dates, like in early 2012, we see positive, statistically significant coefficients. This implies that during these periods, a momentum strategy would have been effective—buying stocks with high past returns could have led to high future returns. There are periods, such as May 2012, where the coefficient is negative and significant, suggesting that a traditional momentum strategy would have been counterproductive. In such times, it could indicate a market reversal where past losers outperform past winners.
    • Economically, this analysis suggests that momentum investing is not universally effective across all periods. The strategy's effectiveness depends on the market environment, with some periods showing that past winners continue to outperform (momentum) and other periods indicating that past winners underperform (reversal).
  • Statistical Interpretation:

    • The variation in the significance and direction of the coefficient across different periods suggests that momentum's effectiveness is not stable over time. This instability makes it challenging to apply a one-size-fits-all momentum strategy across all market conditions.
  • Explanation for a layperson:

    • Momentum investing, which involves buying stocks that have performed well in the past and selling those that have performed poorly, doesn't always work. Our analysis shows that sometimes, stocks that did well in the past year don't continue to do so in the following month. This pattern changes over time; in some months, momentum investing might work, while in others, it might not.
  • Standout periods:

    • Observations across time:

      | time | value |
      | --- | --- |
      | 2020-02-28 | 0.230302 |
      | 2020-03-31 | -0.236381 |

    • There is an abrupt shift in the coefficient from roughly 0.23 to -0.23 between 2020-02 and 2020-03, corresponding to the market meltdown at the onset of COVID-19. The sign flip may reflect the fact that nearly all stocks suffered a large downturn in that period, breaking the usual cross-sectional momentum pattern.

    • Significant periods, such as those with particularly large positive or negative coefficients, highlight times when momentum strategies performed either exceptionally well or poorly. These standout periods could be associated with specific market events or economic conditions that influenced investor behavior and market dynamics.

In conclusion, while momentum strategies can be part of an investor's toolkit, their application requires careful consideration of current and historical market conditions. There's no guarantee that past winners will continue their streak, and as seen, the strategy's effectiveness can change significantly over time.
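The per-period slopes above constitute the first stage of a Fama-MacBeth procedure; the standard second stage, which the notebook stops short of, averages the monthly coefficients and t-tests the mean. A sketch (the values are a rounded subset of the slopes printed above; in practice one would pass the full `coefficients` list):

```python
import numpy as np

# Rounded subset of the monthly PRC_Ret(T12M1) slopes printed above
coefs = np.array([-0.2239, -0.0096, 0.4588, -0.4264, -0.1561,
                  -0.2506, -0.1907, 0.2612, 0.1527, 0.3200])

T = len(coefs)
fm_mean = coefs.mean()                  # Fama-MacBeth point estimate
fm_se = coefs.std(ddof=1) / np.sqrt(T)  # SE of the mean (ignores serial correlation)
fm_t = fm_mean / fm_se                  # t-statistic on the average slope

print(f"mean slope = {fm_mean:.4f}, t = {fm_t:.2f}")
```

A small |t| here would say the average momentum effect is indistinguishable from zero even though many individual months are significant — consistent with the sign flips visible in the coefficient plot.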

d)¶

  • Alpha: It is the measure of an investment strategy's ability to beat the market, or its excess return independent of the market's movements. It is essential because it is supposed to represent the value added by a fund manager's skill. The interpretation of a single regression coefficient as alpha means looking at how much return the strategy can generate over and above the market return. Consistent positive alpha over time would indicate a strategy that consistently adds value beyond market performance.

  • Maximum Drawdown (MDD): This is a crucial risk measure because it captures the largest single drop from peak to trough in the value of a portfolio, providing a real sense of the worst-case scenario an investor might experience. Investors are often more concerned with the potential losses they might incur than with the volatility of returns, making MDD a critical factor in the evaluation process.

  • Sortino Ratio: As it focuses only on downside risk, the Sortino Ratio is particularly valuable for investors who are more concerned with the negative volatility of returns rather than overall volatility. This measure is significant in evaluating strategies that aim to minimize potential losses while still providing reasonable returns.

When it comes to the interpretation of the time-series of regression coefficients, such as alpha, the stability or variability can be quite telling. A stable, consistently positive alpha suggests that a manager has skill that is persistently contributing to outperformance. In contrast, large fluctuations in alpha might indicate changes in market conditions that the manager is either responding to effectively or being adversely affected by, or they could signal that what appeared to be skill was, in fact, luck. Viewing alpha across time allows investors to differentiate between a manager's skill and variance due to chance, helping to assess the long-term viability of an investment strategy.
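Both risk measures can be computed directly from a return series. A minimal sketch (the function names, the 0.0 target return, and the toy returns are illustrative assumptions, not part of the assignment):

```python
import numpy as np

def max_drawdown(returns):
    """Largest peak-to-trough decline of the cumulative wealth curve."""
    wealth = np.cumprod(1.0 + np.asarray(returns))
    running_peak = np.maximum.accumulate(wealth)
    drawdowns = wealth / running_peak - 1.0
    return drawdowns.min()  # most negative drawdown

def sortino_ratio(returns, target=0.0):
    """Mean excess return over a target, scaled by downside deviation only."""
    r = np.asarray(returns)
    downside = np.minimum(r - target, 0.0)   # zero out gains; keep shortfalls
    downside_dev = np.sqrt(np.mean(downside ** 2))
    return (r.mean() - target) / downside_dev

rets = np.array([0.05, -0.10, 0.03, 0.02, -0.04, 0.06])  # toy monthly returns
print(max_drawdown(rets), sortino_ratio(rets))
```

For this toy series the single -10% month is the worst peak-to-trough loss, so `max_drawdown` returns -0.10.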

8 a) univariate regressions¶

Price_Ret(T1); PRC; PRC_Ret(T12); Prices_Ret(T12); PRC_Ret(T12M1); Prices_Ret(T12M1); PRC_Ret(T12_1M); Prices_Ret(T12_1M); and SR_Prices_Ret(T12M1)¶

In [ ]:
def cross_sectional_regression(df, independent_var, dependent_var):
    dates = df.index.unique()
    coefficients = []
    p_values = []

    for period in dates:
        period_data = df.loc[period]
        
        X = sm.add_constant(period_data[independent_var])
        Y = period_data[dependent_var]
            
        model = sm.OLS(Y, X).fit()
        coefficients.append(model.params[independent_var])
        p_values.append(model.pvalues[independent_var])

    results_df = pd.DataFrame({'Date': dates, 'Coefficient': coefficients, 'P-Value': p_values})

    plt.figure(figsize=(8, 3))
    plt.plot(results_df['Date'], results_df['Coefficient'], label='Coefficient')
    plt.axhline(0, color='grey', lw=0.5) 
    plt.title(f'Time Series of {independent_var} Coefficients')
    plt.xlabel('Date')
    plt.ylabel('Coefficient')

    significant_periods = results_df[results_df['P-Value'] < 0.05]
    plt.scatter(significant_periods['Date'], significant_periods['Coefficient'], color='red', label='Statistically Significant (p < 0.05)')

    plt.legend(fontsize="8", loc ="upper left")
    plt.tight_layout()
    plt.show()
In [ ]:
cross_sectional_regression(all_df, 'Price_Ret(T1)', 'PRC_Ret(F1M)')
cross_sectional_regression(all_df, 'prc', 'PRC_Ret(F1M)')
cross_sectional_regression(all_df, 'PRC_Ret(T12)', 'PRC_Ret(F1M)')
cross_sectional_regression(all_df, 'Prices_Ret(T12)', 'PRC_Ret(F1M)')
cross_sectional_regression(all_df, 'PRC_Ret(T12M1)', 'PRC_Ret(F1M)')
cross_sectional_regression(all_df, 'Prices_Ret(T12M1)', 'PRC_Ret(F1M)')
cross_sectional_regression(all_df, 'PRC_Ret(T12_1M)', 'PRC_Ret(F1M)')
cross_sectional_regression(all_df, 'Prices_Ret(T12_1M)', 'PRC_Ret(F1M)')
cross_sectional_regression(all_df, 'SR_Prices_Ret(T12M1)', 'PRC_Ret(F1M)')
(Figures: coefficient time-series plots for the nine univariate regressions of PRC_Ret(F1M) on each signal.)

Economically, I would not expect the coefficients on all of these variables to have the same sign: the short-horizon signals (e.g., Price_Ret(T1)) tend to capture reversal, while the longer-horizon momentum signals (e.g., Prices_Ret(T12M1)) tend to capture continuation.

Survivorship Bias: Only including stocks that have survived through the entire period may bias the results.

Serial Correlation: Financial time series data often exhibit serial correlation, potentially inflating the significance of predictors.

8 b) Multivariate Regression:¶

i. Price_Ret(T1) and Prices_Ret(T12M1)

ii. PRC and PRC_Ret(T12M1)

iii. PRC and Prices_Ret(T12)

iv. PRC and SR_Prices_Ret(T12M1)

In [ ]:
def multivariate_cross_sectional_regression(df, independent_var_list, dependent_var):
    dates = df.index.unique()
    results = []

    for period in dates:
        period_data = df.loc[period]
        
        X = sm.add_constant(period_data[independent_var_list])
        Y = period_data[dependent_var]
            
        model = sm.OLS(Y, X).fit()
        for var in independent_var_list:
            results.append({
                'Date' : period,
                'Variable' : var,
                'Coefficient' : model.params.get(var, 0),
                'p-value' : model.pvalues.get(var, 1)
            })

    results_df = pd.DataFrame(results)

    fig, axs = plt.subplots(1, len(independent_var_list), figsize=(16, 3))
    
    for i, var in enumerate(independent_var_list):
        var_data = results_df[results_df['Variable'] == var]
        axs[i].plot(var_data['Date'], var_data['Coefficient'], label='Coefficient')
        axs[i].axhline(0, color='grey', lw=0.5)
        axs[i].set_title(f'Time Series of {var} Coefficients')
        axs[i].set_xlabel('Date')
        axs[i].set_ylabel('Coefficient')
        
        significant_periods = var_data[var_data['p-value'] < 0.05]
        axs[i].scatter(significant_periods['Date'], significant_periods['Coefficient'], color='red', label='Statistically Significant (p < 0.05)')
        
        axs[i].legend()
    
    plt.tight_layout()
    plt.show()
In [ ]:
multivariate_cross_sectional_regression(all_df_bfnorm, ['Price_Ret(T1)', 'Prices_Ret(T12M1)'], 'PRC_Ret(F1M)')

multivariate_cross_sectional_regression(all_df_bfnorm, ['prc', 'PRC_Ret(T12M1)'], 'PRC_Ret(F1M)')

multivariate_cross_sectional_regression(all_df_bfnorm, ['prc', 'Prices_Ret(T12)'], 'PRC_Ret(F1M)')

multivariate_cross_sectional_regression(all_df_bfnorm, ['prc', 'SR_Prices_Ret(T12M1)'], 'PRC_Ret(F1M)')
(Figures: coefficient time-series plots for the four bivariate regressions on the unnormalized data.)
In [ ]:
multivariate_cross_sectional_regression(all_df, ['Price_Ret(T1)', 'Prices_Ret(T12M1)'], 'PRC_Ret(F1M)')

multivariate_cross_sectional_regression(all_df, ['prc', 'PRC_Ret(T12M1)'], 'PRC_Ret(F1M)')

multivariate_cross_sectional_regression(all_df, ['prc', 'Prices_Ret(T12)'], 'PRC_Ret(F1M)')

multivariate_cross_sectional_regression(all_df, ['prc', 'SR_Prices_Ret(T12M1)'], 'PRC_Ret(F1M)')
(Figures: coefficient time-series plots for the four bivariate regressions on the normalized data.)
  • Economically, I would not expect the sign of all the variables to be the same for different factors in the same multivariate regression because each factor represents a different aspect of price behavior and investor sentiment.

    • Price_Ret(T1) and Prices_Ret(T12M1): Price_Ret(T1) reflects immediate past return, potentially capturing short-term movements which may exhibit a reversal effect in the subsequent period, whereas Prices_Ret(T12M1) indicates longer-term momentum, excluding the immediate past month. The economic expectation is that the short-term return might show a negative coefficient if prices revert to the mean, while the longer-term momentum should typically show a positive coefficient if past winners continue to perform well.
    • PRC and PRC_Ret(T12M1): PRC_Ret(T12M1) captures the momentum effect, which should have a positive coefficient, reflecting the tendency for past winners to continue yielding higher returns.
    • PRC and SR_Prices_Ret(T12M1): SR_Prices_Ret(T12M1) could be capturing a short-term reversal that may happen after a period of momentum. The expected sign might be negative, suggesting that after a period of sustained returns, prices might correct themselves.
  • Normalized vs. Unnormalized Data
    • When running the regressions on both the unnormalized and the cross-sectionally normalized data, we found that with unnormalized data the scale of the coefficients varies across regressions. For example, in the regression on prc and PRC_Ret(T12M1), the coefficient on 'PRC_Ret(T12M1)' ranges roughly over [-0.2, 0.2], whereas in the regression on prc and SR_Prices_Ret(T12M1), the coefficient on 'SR_Prices_Ret(T12M1)' ranges roughly over [-0.015, 0.015]. We therefore use the cross-sectionally normalized data for the remaining regressions.
  • Particularities/Issues with the Regressions

    • Multicollinearity: Since some of the independent variables are different forms of price returns, they might be highly correlated with each other, leading to multicollinearity. This could inflate the variance of the coefficient estimates and make the results less reliable.

    • Time Variation: The coefficients fluctuate over time, which suggests that the relationship between the predictors and future returns is not stable. This could be due to changes in market efficiency, investor behavior, or macroeconomic conditions.

    • Autocorrelation: In time-series data, there could be autocorrelation that violates the regression assumptions and could lead to misleading inference if not properly addressed.

    • Risk Adjustments: The regressions might not account for risk factors that could explain returns. For instance, stocks with high momentum might also have higher risk, which is not captured in a simple regression.
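The multicollinearity concern above is easy to quantify: with two regressors, the variance inflation factor is 1/(1 - r²), where r is their correlation. A sketch on synthetic data (the 0.9 loading is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=500)  # nearly collinear with x1

# With two regressors, VIF = 1 / (1 - r^2) for their correlation r
r = np.corrcoef(x1, x2)[0, 1]
vif = 1.0 / (1.0 - r**2)
print(f"corr = {r:.3f}, VIF = {vif:.1f}")
```

Rules of thumb flag VIFs above roughly 5-10; highly overlapping return signals such as PRC_Ret(T12) and PRC_Ret(T12M1) would likely land well above that.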

9 Cross-sectional regression:¶

Independent variable: PRC_Ret(T12M1)

Dependent variables: PRC_Ret(F3M) and PRC_Ret(F6M).

In [ ]:
cross_sectional_regression(all_df, 'PRC_Ret(T12M1)', 'PRC_Ret(F3M)')
(Figure: time series of PRC_Ret(T12M1) coefficients predicting PRC_Ret(F3M), significant months in red.)
In [ ]:
cross_sectional_regression(all_df, 'PRC_Ret(T12M1)', 'PRC_Ret(F6M)')
(Figure: time series of PRC_Ret(T12M1) coefficients predicting PRC_Ret(F6M), significant months in red.)
  • Observations on the results:

    • Coefficient Fluctuations: The coefficients over time fluctuate around zero, which suggests that the predictive power of PRC_Ret(T12M1) on future returns varies over time and may not be consistent.

    • Statistical Significance: In both plots, red dots signify statistically significant coefficients at the 5% level. There appear to be many more statistically significant points for the F1M forward returns compared to F3M and F6M. This indicates that as the forward-looking period increases, the predictive power of PRC_Ret(T12M1) diminishes.

    • Trend in Significance: The trend of the coefficients does not seem to follow a clear pattern, which could suggest that the relationship between past returns and future returns is not stable over time and may be influenced by market conditions, behavioral factors, or other external variables.

  • Interpretation: Given the observed reduction in statistically significant coefficients as the forward return period increases, we can deduce that momentum signals, specifically PRC_Ret(T12M1), are less predictive of longer-term returns. This might be explained by:

    • Market Efficiency: Markets may incorporate information more fully over longer periods, diluting the momentum effect.
    • Mean Reversion: Over longer periods, there's a higher chance for mean reversion, where stocks that performed well in the past may underperform as they revert to their long-term average.
  • Potential Econometric/Statistical Issues:

    • Non-Stationarity: Financial time series data may be non-stationary, particularly over different time horizons, which can affect the reliability of regression coefficients.
    • Serial Correlation: The presence of serial correlation in financial returns can lead to misestimation of the significance levels of the coefficients.
    • Changing Risk Premia: The risk-return tradeoff may change over time, which means the factors driving returns in one period might not be the same in another.
  • Correcting Issues:

    • Addressing Non-Stationarity: Use techniques like differencing, detrending, or transforming the series into stationary through logarithmic or percentage change transformations.
    • Accounting for Serial Correlation: Employ models that specifically adjust for time-series correlation, like the Generalized Method of Moments (GMM) or Newey-West standard errors.
    • Modeling Changing Risk Premia: Introduce additional variables to the regression that can account for changes in risk premia over time, or use rolling regression windows to capture the evolving relationships.
  • Conclusion:

    • The Fama-MacBeth regression results, particularly the decrease in statistically significant coefficients for longer forward returns, suggest that the momentum effect identified by PRC_Ret(T12M1) diminishes over longer periods. This is in line with the economic theory of market efficiency and mean reversion over time. The reliability of these regression results can be improved by addressing the potential econometric and statistical issues mentioned, thus yielding more robust insights into the predictive power of momentum signals across different investment horizons.

Q10¶

  • Understanding of Momentum Signals:

    • The homework solidifies the concept of momentum in finance—that past returns can influence future performance. It's a deep dive into how quantitative researchers can identify and leverage historical data patterns to predict future market movements.
    • Learning how to create various momentum signals from raw data teaches the practical aspects of data handling, signal processing, and the nuances of different financial metrics. It’s crucial for developing any systematic trading strategy.
  • Econometric Skills:

    • Cross-sectional Regression Analysis: Performing Fama-MacBeth regressions teaches us how to explore the relationships between return predictors and actual returns across different assets at a given time.
  • Understanding Statistical Significance:

    • Through this exercise, we learn to critically evaluate the statistical significance of our findings, understanding not just when a signal appears to predict returns, but also gauging the reliability of these predictions.
  • Investment Strategy Insights:

    • Efficacy of Momentum Strategies: By analyzing momentum [-12, -1] or any other specified periods, we gain insights into the effectiveness of momentum-based investment strategies. It helps quantify how past performance can be indicative of future returns.
  • Diverse Time Horizons:

    • Comparing short-term (F1M) with medium (F3M) and long-term (F6M) forward returns highlights how the predictive power of signals can vary over different investment horizons. This can inform the development of strategies tailored to specific time frames.
  • Innovation in Strategy Development:

    • This exercise also underlines the importance of continuous learning in developing investment strategies. The financial markets are ever-changing, and strategies that worked in the past may not work in the future. Hence, a quantitative researcher must always be testing, learning, and adapting.